1 Introduction


In the course of studying how to shuffle big decks of cards, G. White and I
had to answer the following question: how long does it take to mix $n$ red
balls and $n$ black balls, half of which are contained in one urn and the rest
in another, if at each step we pick $k$ balls from each urn and exchange them?
The answer to that question can be found in [11]. To study the above model,
it was natural to ask a similar question about Ehrenfest's urn model. More
precisely, consider two urns. Initially urn one contains zero balls and urn two
contains $n$ balls. At each step, pick $k$ balls in total at random and move each
of them to the opposite urn.

The above Markov chain can also be viewed as a random walk on $(\mathbb{Z}/2\mathbb{Z})^n$
where at each step we flip $k$ random coordinates (for some fixed $k$). For the
walk to be transitive, $k$ needs to be odd, and to avoid parity problems, it
is simplest to consider the lazy version of this walk. In other words, at
each step, do nothing with probability 1/2 and with probability 1/2 choose
a random set of $k$ coordinates and flip them. The main question is to find
the mixing time of this walk for the total variation distance.
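The lazy walk just described can be sketched in a few lines. This is only an illustration: the function name `lazy_k_flip_step` and the choice $n = 10$, $k = 3$ are assumptions made for the example and do not come from the paper.

```python
import random

def lazy_k_flip_step(x, k, rng):
    """One step of the lazy walk: hold with probability 1/2, else flip k distinct coordinates."""
    if rng.random() < 0.5:
        return list(x)                          # lazy move: stay put
    y = list(x)
    for i in rng.sample(range(len(x)), k):      # a uniformly chosen set of k coordinates
        y[i] ^= 1
    return y

rng = random.Random(0)
trajectory = [[0] * 10]                         # start at the identity, n = 10
for _ in range(20):
    trajectory.append(lazy_k_flip_step(trajectory[-1], 3, rng))
```

Consecutive states in `trajectory` differ in either 0 or exactly $k$ coordinates, which is the non-local feature of the walk.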

This non-local walk implies a big change at each step. So far people 
have mostly studied local models; in the case of the hypercube for example, 
the most famous model is the one that considers picking one coordinate at 
random and flipping it. But of course in that case the outcome after one 
step is not very different from the initial configuration, which is why mixing 
is slower. Of course, really big changes (e.g. k = n) make the mixing faster. 
A first heuristic explained below is that flipping k coordinates at each step 
should be roughly the same as moving one each time and repeating k times. 

There is a second reason why this particular random walk is interesting.
There are two different approaches to finding the mixing time of this walk.
The first approach is developed in Section 5. It involves finding the eigenvalues
of the walk using representation theory and using the Fourier transform
to give bounds on the $\ell^2$ norm of the difference $P^{*l} - U$. For the case of
$k = 1$, this technique works nicely and gives a sharp upper bound on the
mixing time. However, for $k = n/2$ it turns out that the bound obtained via
the $\ell^2$ norm does not give a sharp upper bound on the mixing time, which is
defined in terms of the total variation distance ($\ell^1$ norm).

A second argument via coupling is introduced in Section 3. It provides
a solution to the general case and makes the difference between the $\ell^2$ norm
and total variation distance clear. This coupling argument is a generalization
of one used by D. Aldous [2] for the case $k = 1$. See [1] for more results of
Aldous on the hypercube. The lower bound uses the first two eigenvectors
and eigenvalues of the random walk and the second moment technique. This
method was first introduced by P. Diaconis and M. Shahshahani in [7]. In
their paper, they proved a lower bound for the case $k = 1$ that
matched Aldous' upper bound, proving in this way the existence of a
cut-off at $\frac{1}{4}(n + 1)\log n$. Another proof of the lower bound was given by
L. Saloff-Coste in [13] using Wilson's lemma. In [5], Diaconis, Graham and
Morrison use Fourier analysis directly to derive the exact behavior of the
error for the nearest neighbor random walk.

The results of this section are the following: 

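Theorem 1. For the lazy walk changing $k \le n/2$ coordinates on the $n$-dimensional
hypercube the following hold:

1. For $l = \left(\frac{8n}{k}\log n + \frac{3n}{k} + \frac{2\sqrt{2}\,n}{(\sqrt{2}-1)k} + 2\right) + c\sqrt{\frac{n}{k}\log n}$,

\[ \|P^{*l} - U\|_{T.V.} \le \frac{A}{c^2}, \]

where $c > 0$ and $A$ is a constant.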


2. For $l = \frac{n}{2k}\log n - c\,\frac{n}{k}$, where $0 < c < \frac{1}{2}\log n$,

\[ \|P^{*l} - U\|_{T.V.} \ge 1 - \frac{B}{e^{c}} \]

for a uniformly bounded constant $B > 0$.
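For small $n$ the total variation distance in Theorem 1 can be computed exactly, which gives a way to compare the bounds with the true behavior. The sketch below assumes the walk starts at the identity; by symmetry under permutations of the coordinates the law at any time is then uniform on each Hamming-weight level, so it is enough to track the distribution of the weight. The function names are illustrative and not taken from the paper.

```python
from math import comb

def weight_step(dist, n, k):
    """Push a distribution on Hamming weights {0..n} through one lazy k-flip step."""
    new = [0.5 * p for p in dist]                     # lazy half: stay put
    total = comb(n, k)
    for w, p in enumerate(dist):
        if p == 0.0:
            continue
        # a = number of ones among the k flipped coordinates (hypergeometric)
        for a in range(max(0, k - (n - w)), min(k, w) + 1):
            prob = comb(w, a) * comb(n - w, k - a) / total
            new[w + k - 2 * a] += 0.5 * p * prob      # a ones become zeros, k - a zeros become ones
    return new

def tv_to_uniform(n, k, steps):
    """Total variation distance to uniform after `steps` steps, started at the identity."""
    uniform = [comb(n, w) / 2 ** n for w in range(n + 1)]
    dist = [1.0] + [0.0] * n
    for _ in range(steps):
        dist = weight_step(dist, n, k)
    return 0.5 * sum(abs(p - u) for p, u in zip(dist, uniform))
```

For instance, `tv_to_uniform(10, 3, l)` for increasing `l` traces the decay of the distance for $n = 10$, $k = 3$.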


Remark 2. It is easy to see that the mixing time for the k model and the 
n — k model will be the same, therefore we will focus on the case k < n/2. 


Here are a few computations:

Table 1: Examples for the $k$-walk on the hypercube

  n      k      upper bound for the mixing time
  54     27     19
  54     3      576
  418    209    26
  418    7      2899
  550    275    27
  550    25     1,112

Section 7 contains the analysis for the $\ell^2$-mixing time of the random walk
on $(\mathbb{Z}/m\mathbb{Z})^n$ generated by the measure

\[ Q(a_{i_1} e_{i_1} + a_{i_2} e_{i_2} + \cdots + a_{i_k} e_{i_k}) = \frac{1}{m^k \binom{n}{k}}, \]

where $a_{i_j} \in \mathbb{Z}/m\mathbb{Z}$ and $\{i_1, i_2, \ldots, i_k\} \subset \{1, 2, \ldots, n\}$.
The main result is:


Theorem 3. For the walk generated by $Q$, if $l = \frac{n+1}{2k}\log(mn) + c\,\frac{n+1}{2k}$ then

\[ 4\|Q^{*l} - U\|^2_{T.V.} \le e^{-c}. \]

It is known that the $\ell^1$-mixing time is faster than that (using the fact that
the first time that we have touched all coordinates is a strong stationary time),
but the above result holds for the $\ell^2$ norm, which allows us to use comparison
theory to provide bounds for the ^-mixing times of the walk on (Z/mZ) n 
generated by 

p 2(±ei) = ^ p ( id ) = \ 

More precisely, we prove a bound of the form $m^2\left(\frac{n+1}{2}\log(mn) + c\,\frac{n+1}{2}\right)$ for
the mixing time of the last random walk. The details are included in Section 9.
The analysis of the $\ell^2$ norm of the last walk has already been done by Diaconis
and Saloff-Coste [6], where they proved an upper bound of order $m^2 n\log n$,
and then Saloff-Coste proved the cut-off [12].

2 The history of Ehrenfest's urn model

Ehrenfest's urn model was introduced by Tatjana and Paul Ehrenfest [8]
to study the second law of thermodynamics. This is a model for $n$ particles
distributed in two containers, where each particle changes container independently
of the others at rate $\lambda$. This process is repeated several times and
the question is to find the limiting distribution of the process. M. Kac [9]
approached this problem by finding the eigenvectors and eigenvalues of the
transition matrix. He also proved that if the initial system state is not at
equilibrium then the entropy is increasing.
Figure 1: Eleven particles in two containers, five of each changing containers

This problem can also be viewed as a random walk on $(\mathbb{Z}/2\mathbb{Z})^n$ where the
ones in a binary vector mark the particles in the right-hand
container. Flipping one (or $k$) coordinates of the binary vector corresponds
to moving one (or $k$) particles to the other container. The Markov
chain problem can thus be studied through a random walk on an abelian group,
where representation theory is quite simple to use. As Persi Diaconis writes
in Chapter 3 of his book [4], Kac posed the question: When can a Markov
chain be lifted to a random walk on a group?


3 Coupling Argument 


Consider the walk on the hypercube $(\mathbb{Z}/2\mathbb{Z})^n$:

\[ P(g) = \begin{cases} \frac{1}{2} & \text{if } g = \mathrm{id}, \\ \frac{1}{2\binom{n}{k}} & \text{if } g \in (\mathbb{Z}/2\mathbb{Z})^n \text{ has } k \text{ ones and } n - k \text{ zeros}, \\ 0 & \text{otherwise.} \end{cases} \]

Here is the coupling argument which will provide an upper bound for the
mixing time for $k \le \frac{n}{2}$:

Start with two copies of the Markov chain. At time $t$ denote
the state of each as $X_t^1$ and $X_t^2$. $X_t^1$ will start at the identity while $X_t^2$ will
start at a random configuration. At time $t$, let

\[ y(t) = \|X_t^1 - X_t^2\|_1 = \sum_i |X_t^1(i) - X_t^2(i)| \qquad (1) \]

denote the $\ell^1$ distance between the two configurations. Then consider
the following cases:


1. If $y(t)$ is odd then take one independent step on each chain according
to the probability measure $P$.

2. If $y(t)$ is even then with probability $\frac{1}{2}$ stay fixed in both chains. With
probability $\frac{1}{2\binom{n}{k}}$ choose $k$ coordinates and change $X_t^1$ completely on them. In
terms of $X_t^2$, look at the $k$ coordinates that you picked.

Definition 4. Denote by $a(t)$ the number of the mismatching coordinates
among the $k$ ones selected.

If $a(t) > \frac{y(t)}{2}$ then change $X_t^2$ completely on the $k$ coordinates picked.
Otherwise, if $a(t) \le \frac{y(t)}{2}$, at first change $X_t^2$ at the coordinates where the
two chains match among the $k$ ones picked. Then for every mismatching
coordinate among the $k$ ones picked, find the next mismatched coordinate
that was not picked (and not already found), moving cyclically, and flip it
on $X_t^2$ only.
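The non-lazy move of the coupling for even $y(t)$ can be sketched as follows. This is one reading of the rule above, not a definitive implementation: in particular, the way the "next mismatched, not picked" coordinate is searched (scanning cyclically from the picked coordinate and skipping partners already used) is an assumption, and the name `coupled_move` is hypothetical.

```python
def coupled_move(x1, x2, picked):
    """Apply one non-lazy coupled move, given the set of k picked coordinates."""
    n = len(x1)
    picked_set = set(picked)
    y = sum(u != v for u, v in zip(x1, x2))               # mismatched coordinates, y(t)
    hit = sorted(i for i in picked if x1[i] != x2[i])     # mismatched coordinates picked
    a = len(hit)                                          # a(t)
    new1, new2 = list(x1), list(x2)
    for i in picked:                                      # first chain flips every picked coordinate
        new1[i] ^= 1
    if 2 * a > y:                                         # too many mismatches picked: move together
        for i in picked:
            new2[i] ^= 1
        return new1, new2
    for i in picked:                                      # matching picked coordinates flip in both chains
        if x1[i] == x2[i]:
            new2[i] ^= 1
    used = set()
    for i in hit:                                         # pair each picked mismatch with an unpicked one
        j = (i + 1) % n
        while x1[j] == x2[j] or j in picked_set or j in used:
            j = (j + 1) % n
        used.add(j)
        new2[j] ^= 1                                      # flipped in the second chain only
    return new1, new2

# Four mismatches (coordinates 0, 1, 4, 6); picking {0, 3, 4} hits two of them.
a1, b1 = coupled_move([0] * 9, [1, 1, 0, 0, 1, 0, 1, 0, 0], [0, 3, 4])
# Two mismatches, both picked: the chains move together and the distance is unchanged.
a2, b2 = coupled_move([0] * 5, [1, 1, 0, 0, 0], [0, 1, 2])
```

In the first example both chains flip three coordinates and the distance drops from 4 to 0, as in the situation depicted in Figure 2.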

Then the following lemma, which can be found in Chapter 4 of [4], says how
the above coupling can be used to get an upper bound for the total variation
distance:

Lemma 5. Let $T$ be the first time the two chains match. Then

\[ \|P^{*l} - U\|_{T.V.} \le P(T > l). \]

The above lemma and Chebyshev's inequality will be the main tools to
prove Theorem 1.

Remark 6. For the random walk on a group, both the $\ell^1$ and $\ell^2$ norms are
independent of the starting state. The first chain could start at any fixed
configuration. This is why Theorem 1 is stated for any starting configuration.

4 Proof of Theorem 1

4.1 The upper bound for $k = \frac{n}{2}$

Let's take a look at the case of $k = \frac{n}{2}$, for $n$ even: it gives insight into how
to prove the general case. Also, the Fourier transform argument in Section 6
gives a worse upper bound.

Theorem 7. If $l = 8\log\left(\frac{n}{2}\right) + c\sqrt{\log\frac{n}{2}}$ for $c > 0$ then

\[ \|P^{*l} - U\|_{T.V.} \le \frac{A}{c^2}. \]
Figure 2: This picture gives an example of how the coupling would work. At first the matching coordinates picked get flipped. Then the first mismatched coordinate picked is flipped on $X^1$, and the next mismatched, not picked coordinate is found and flipped only on $X^2$; the same is done for the last mismatched coordinate picked. Both chains end at $(0, 0, 0, 1, 0, 0, 1, 0, 0)$. In each chain, three coordinates were flipped, but $y(t) = 4$ while $y(t+1) = 0$.
The following lemma will help in proving Theorem 7. Let $X_t^1$ and $X_t^2$ denote
the configuration of the first copy and the second copy of the same Markov
chain respectively. Also, $X_0^1 = \mathrm{id}$ while $X_0^2$ is random. Denote by $y_t$ the
number of coordinates at which $X_t^1$ and $X_t^2$ differ at time $t$. Let $a_t$ count
how many differing coordinates are picked at step $t$ of the coupling process.

Lemma 8. With the notation above, 


1. IfVt > f then P(y t -2<at< f) > i 

2. Ifyt < f thenP( y -i<a t < y i) >± 


Proof. 


1. At first notice that if yt — | < i < y then 



Therefore using the facts that 



1 


(3) 


and that 



(4) 


equations l2ll3l and Q] give exactly that 



2. For the case where yt < f the goal is at first to prove that 



(5) 


and then that 
(6)



To prove equation ([5]) just notice that equation (J2D is still valid for 
0 < i < Therefore 


\ = 2P (l < a t < | - l) + 2P (a t = 0) + P (a* = |) 

Then notice that 2 P (a t = 0) = 2( n k y ) < (f)(r|) — P (a t — §) and 

V 2 ' 2 7 V 2 2 7 V 

therefore equation (j5j) holds. 

To prove equation ([6]) it suffices to prove that 


which is equivalent to showing that the following inequality holds: 

,3y ,3y . . ,3y . „ w n y , w n y . . ,n y . _. 

( —+*)(—+*-l)... ( —-*+l)(--+*)( -+2-1)... ( --*+1) 

v 4 a 4 ' v 4 a 2 4 a 2 4 ' v 2 4 ’ 


^ .s,V . 3y ...n 3y . ,n 3y . 

> (j+o(|+*-i) • • • (j-'+i)( r >)( r |+'-i) • • • (j 


This is obviously true since ^ f > f + i 


n 

2 


3y 


6 and | 


I + * 


6 > 


+ i — b. 


□ 


The above lemma is the main tool in the proof of Theorem 7.

Proof. At first, if $y(0)$ is odd then wait until the coupling suggests staying
fixed in one of the chains while making moves on the other chain. The expected
time for this to happen is 2 since this time follows the geometric distribution
with probability of success $\frac{1}{2}$.

To prove the theorem, consider the following two cases.

1. If $y_t > \frac{n}{2}$ then consider a much slower process where if at time $t$ you
pick fewer than $y_t - \frac{n}{2}$ mismatching coordinates you do nothing, while if
you picked more than $y_t - \frac{n}{2}$ then you act as if you picked only $y_t - \frac{n}{2}$ of
them. Now, let $W$ be the first time that $a_t = y_t - \frac{n}{2}$ mismatched
coordinates are picked. Then part 1 of Lemma 8 gives that

\[ E(W) \le 4, \]

where 4 is the expectation of the geometric distribution with probability of success
equal to $\frac{1}{4}$.

This means that after an average of 4 steps the number of mismatched
coordinates is $y_t - 2\left(y_t - \frac{n}{2}\right) = n - y_t$, which in this case would
be at most $\frac{n}{2}$.

2. If $y_t \le \frac{n}{2}$ then again consider a much slower process where if at time
$t$ the number of mismatched coordinates picked is $a_t$ and $\frac{y_t}{4} \le a_t \le \frac{y_t}{2}$
then act as if you picked $\frac{y_t}{4}$ mismatched coordinates, otherwise do
nothing. Let $B_i$ be the $i$-th time that $\frac{1}{4}$ of the mismatched coordinates
is picked and $i_0$ be the time we picked $\frac{1}{4}$ of the mismatched coordinates
and we ended up with only 2 mismatched coordinates left. Then $1 \le i_0 \le \log\frac{n}{2}$
and if $B_{i_0} = B_1 + \sum_{i=1}^{i_0 - 1}(B_{i+1} - B_i)$ is the total number of steps needed
to end up with only 2 mismatched coordinates left then

\[ E(B_{i_0}) \le 8\log\frac{n}{2}. \]


Finally, it is important to estimate the probability of picking one mismatched
coordinate when there are only 2 mismatched coordinates in
order to finish the proof:

\[ P(a_t = 1) = \frac{n}{4(n - 1)}, \]

so again if $R$ is the first time one of the two mismatched coordinates is
picked then

\[ E(R) \le 4. \]

Summing up, if $T$ is the coupling time then

\[ E(T) \le E(W) + E(B_{i_0}) + E(R) \le 8\log\left(\frac{n}{2}\right) + 10 \]

and

\[ \mathrm{Var}(T) \le A\log\frac{n}{2}, \]

where $A$ is a constant. Then Chebyshev's inequality implies the rest. □
4.2 Upper Bound

The following lemma is the key to the proof of Theorem 1. The lemma mainly
bounds from below the probability that a sufficient number of mismatched
coordinates are picked.

Lemma 9. With $a(t)$ defined in Definition 4 and $y_t := y(t)$, which is given
by equation (1):

1. $P(a_t = i)$ is increasing as a function of $i$ when $i \le \frac{y_t k - n + y_t + k - 1}{n + 1}$ and
decreasing when $i > \frac{y_t k - n + y_t + k - 1}{n + 1}$.

2. Ify t >k and ^ > 2 then P (Hfk < a t < min{^, k }) > 

3. If yt > k and 1 < ^ < 2 then P (l < a t < min{y, k }) > 

4- IfUt>k and ^ < 1 then P (l < a t < minjfc, ^-}) > 

5. Ify t >k and^ < ^ then P (l < a t < min{f, k}) > f£. 

6. If y t <k and^>2 then P (f£ < a t < f) > ±. 

7. Ifiy < k and 1 < & < 2 then P (l < a t < f) > |. 

5. If y t < k and < v -f < 1 then P (l < a t < §) > 

5. If y t < k and y -f < ^ then P (l < a t < f) > fjh 
Proof. The proof is quite technical and goes as following: 


1. Let f(i) = P(at = i), where a{t) is the number of the mismatching 
coordinates among the k ones selected. Then 


/(<) 



andj^q- < 1 if and only if i < — ■ 
2. If yt > k and ^ > 1 then using the part (1) of the lemma


p { «j!i<a,<y*)>p( 0 <a t <«* 
'2 n ~ n ~ V 2 n 


( 7 ) 


and then if k < &■, it follows that 


Vtk 


- — P I 0 < a t < — + P —— < a t < k 


< 2 P 


ytk 
2 n 


2 n 


< a t < k 


Vtk 


2 n 


whereas if k > ^ then 


Vtk 


2 n 


<4 P{ y* <at <yi 

1 2n ~ 2 




Vt 


- — P 0 < a t < — + P < a t < — )+P( — < a t < k 


2 n 


yt 


where the last inequality is true because of relation ([7|) and the fact that 
the interval (y,fc] contains at most twice as many integers as [|^, ^] 
does. 


3. Now if yt > k and 1 > — < 2 then part (1) of the Lemma says that 
P (a t = 0) < P ( a t — 1) < P (l < a t < minjfc, ^}) and therefore imi¬ 
tating the proof of part (2) one gets that \ < 3P (l < a t < min{y, *})• 

4. In this case, it suffices to bound P ( a t = 0). 


= (n - k)(n -k-l)...(n-k-y t + l) < 1 
2 n(n — 1)... (n — y — t + 1) ~ 2 


1 - 


n 


yt 


< 


< 


2e~ v -£ 2V2 

where the last inequality holds because . Therefore, 


P (l<a t < minjfc, > 



v/2 — 1 
Ay/2 
5. As above P (min{y, k} < a t < k) < P (l < a t < min{y, k}) and the
goal is to bound P (at = 0). Using the fact that e~ 2x < 1 — x whenever 
x < —conclude that 

P (a t = 0) < -e~ 2 ^ < -(1 - 
v ; “ 2 “ 2 V 2n 

and therefore 

6. The proof of this case is similar to the proof of part (2) of the Lemma. 
The only difference is that at runs between 0 and yt- 


7. In this case P (a t = 0) < P (a t = 1) < P (l < a t < y) because of part 
(1) of the Lemma. Then notice again that 

p (| < at < y) < 2P (l < at < |) 

and then imitate the proof of part (2) of the Lemma. 


8. Similarly to the cases above we have that P (l < a t < |) > P (| < a t < y). 
To bound P (a t = 0), expand as : 


P (a t = 0) = 


(n — k)(n — k — 1)... (n — k — y t + 1) 1 


1 

- 2 C 


2 n(n — 1 )... (n 

vt k 1 

n < 


y-t + 1) 


1 -- 


n 


yt 


2V2 


where the last inequality holds because < —. Using the above and 

imitating the arguments from the above parts we have that P (l < a t < > 

d-75) = U2-1 
4 4\/2 


9. As above P (l < a t < |) > P (| < a t < y). To bound P (a t = 0) use 
the fact that e~ 2x <l — x whenever x < Then,the calculations of 
the previous part of the Lemma give that 


P (a t = 0) < 



< -(1 
“ 2 v 


yk_ 
2 n 


and therefore 


P (1 < at < 


yt 


> 


Vtk 
8 n 





□ 


The above lemma now leads to the proof of Theorem 1.

Proof. At first, in case the starting number of mismatched coordinates
of the chains is odd, wait until the coupling suggests staying fixed at
one of them and taking a step on the other to turn the difference even. Call
$T_w$ the time the above happens. Then $T_w$ follows a geometric distribution
with probability of success $\frac{1}{2}$. So $E(T_w) = 2$ and $\mathrm{Var}(T_w) = 2$.

For general $k$, let $y_t$ be the number of mismatched coordinates at time $t$. Also let
$a_t$ be the number of mismatched coordinates picked at time $t + 1$ in running
the coupling process. Consider the following cases:

1. If at time $t$ it is true that $\frac{y_t k}{n} \ge 2$ then $P\left(\frac{y_t k}{2n} \le a_t \le \min\{\frac{y_t}{2}, k\}\right) \ge \frac{1}{8}$.
Then consider a much slower process where whenever picking $\frac{y_t k}{2n} \le a_t \le \min\{\frac{y_t}{2}, k\}$
mismatched coordinates we act as if only $\lfloor \frac{y_t k}{2n} \rfloor$ were
picked. Then the expected number of steps to either exhaust all of the
mismatched coordinates or fall into one of the other cases will be at
most $\frac{8n}{k}\log n$.

2. If $1 \le \frac{y_t k}{n} < 2$ then $P\left(1 \le a_t \le \min\{\frac{y_t}{2}, k\}\right) \ge \frac{1}{6}$. Therefore working
as before, the expected number of steps to either exhaust all of the
mismatched coordinates or fall into one of the other cases will be at most
$\frac{3n}{k}$.

3. If $\frac{y_t k}{n} < 1$ then $P\left(1 \le a_t \le \min\{\frac{y_t}{2}, k\}\right) \ge \frac{\sqrt{2} - 1}{4\sqrt{2}}$, so again the expected
number of steps to either exhaust all of the mismatched coordinates or
fall into one of the other cases will be at most $\frac{2\sqrt{2}\,n}{(\sqrt{2} - 1)k}$.

Putting the bounds together:

\[ E(T) \le \frac{8n}{k}\log n + \frac{3n}{k} + \frac{2\sqrt{2}\,n}{(\sqrt{2} - 1)k} + 2 \]

and

\[ \mathrm{Var}(T) \le A\,\frac{n}{k}\log n, \]

where $A$ is a constant. Then Chebyshev's inequality yields the rest.


□ 
4.3 Lower Bound

The lower bound will be proved using the eigenvectors and eigenvalues for
this Markov chain. Theorem 6 of [4] (page 49) says that the eigenvalues
are given by the Krawtchouk polynomials and the eigenvectors are the normalized
Krawtchouk polynomials. To see this, notice that the irreducible representations
of $(\mathbb{Z}/2\mathbb{Z})^n$ are indexed by vectors $a \in (\mathbb{Z}/2\mathbb{Z})^n$ so that

\[ \rho_a(v) = (-1)^{a \cdot v}. \]

Therefore, the Fourier transform of $P$ at $\rho_a$ is

\[ \hat{P}(\rho_a) := \sum_{v} \rho_a(v) P(v) = \frac{1}{2} + \frac{1}{2}\sum_{b=0}^{j} (-1)^b \frac{\binom{j}{b}\binom{n-j}{k-b}}{\binom{n}{k}}, \]

where $j$ denotes the number of coordinates of $a$ that are equal to one. According
to Theorem 6 of [4], the eigenvalues of the transition matrix are exactly
the $\hat{P}(\rho_a)$, $a \in (\mathbb{Z}/2\mathbb{Z})^n$. The corresponding (non-normalized) eigenfunction
is $f_a(x) = (-1)^{x \cdot a}$. Notice that all $a \in (\mathbb{Z}/2\mathbb{Z})^n$ that have the same number
of zeros give the same eigenvalue. Thus, if $|x|$ denotes the number of ones of
$x$, the $j$-th Krawtchouk polynomials

\[ f_j(x) = \sum_{b=0}^{j} (-1)^b \frac{\binom{|x|}{b}\binom{n-|x|}{j-b}}{\binom{n}{j}} \]

are eigenfunctions and their normalized form will be used to compute the
lower bound for the mixing time.
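The eigenvalue formula above can be checked numerically for small parameters. The sketch below, with illustrative function names that are not from the paper, evaluates the Krawtchouk expression in exact arithmetic and compares it with a brute-force Fourier transform that sums $(-1)^{a \cdot v}$ over every set of $k$ flipped coordinates.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

def lazy_eigenvalue(n, k, j):
    """1/2 + 1/2 * sum_b (-1)^b C(j,b) C(n-j,k-b) / C(n,k), as an exact fraction."""
    s = sum((-1) ** b * comb(j, b) * comb(n - j, k - b) for b in range(min(j, k) + 1))
    return Fraction(1, 2) + Fraction(s, 2 * comb(n, k))

def brute_force_eigenvalue(n, k, a):
    """Fourier transform of P at rho_a, summing over all k-subsets of flipped coordinates."""
    total = sum((-1) ** sum(a[i] for i in subset) for subset in combinations(range(n), k))
    return Fraction(1, 2) + Fraction(total, 2 * comb(n, k))

# Compare the two computations for every a in (Z/2Z)^5 with k = 2.
agree = all(
    brute_force_eigenvalue(5, 2, a) == lazy_eigenvalue(5, 2, sum(a))
    for a in product((0, 1), repeat=5)
)
```

For $j = 1$ the formula reduces to $1 - \frac{k}{n}$ and for $j = 2$ to $1 + \frac{2k^2 - 2kn}{n^2 - n}$, the two eigenvalues used in the lower bound argument.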

Remember that the definition of the total variation distance is

\[ \|P - Q\| = \sup_{A} |P(A) - Q(A)|. \]

A specific set $A$ will provide a lower bound. To find this lower bound, consider
the normalized Krawtchouk polynomial of degree one $f(x) = \sqrt{n}\left(1 - \frac{2x}{n}\right)$
and the non-normalized Krawtchouk polynomial of degree two $f_2(x) = 1 - \frac{4x}{n-1} + \frac{4x^2}{n^2 - n}$.
Then, consider $A_\alpha = \{x : |f(x)| \le \alpha\}$. A specific choice for
$\alpha$ will guarantee the correct lower bound.

The orthogonality relations that the normalized Krawtchouk polynomials
satisfy give that if $Z$ denotes the number of ones of a point chosen uniformly
at random in the hypercube, so that $Z$ takes values in $X = \{0, 1, 2, \ldots, n\}$,
then

\[ E_U\{f(Z)\} = 0 \quad\text{and}\quad \mathrm{Var}_U\{f(Z)\} = 1. \]
Now under the convolution measure,

\[ E\{f(Z_l)\} = \sqrt{n}\left(\frac{1}{2} + \frac{1}{2}\left(1 - \frac{2k}{n}\right)\right)^l = \sqrt{n}\left(1 - \frac{k}{n}\right)^l \]

because $f$ is an eigenfunction of the Markov chain corresponding to the
eigenvalue $1 - \frac{k}{n}$. Again under the convolution measure,

\[ \mathrm{Var}\{f(Z_l)\} = \frac{n}{n} + \frac{n(n-1)}{n}\left(1 + \frac{2k^2 - 2kn}{n^2 - n}\right)^l - n\left(1 - \frac{k}{n}\right)^{2l} = 1 + (n - 1)\left(1 - \frac{2kn - 2k^2}{n^2 - n}\right)^l - n\left(1 - \frac{k}{n}\right)^{2l}. \]

Recall that the first three (non-normalized) eigenfunctions of this Markov
chain are:

\[ f_0(x) = 1, \quad f_1(x) = 1 - \frac{2x}{n} \quad\text{and}\quad f_2(x) = 1 - \frac{4x}{n-1} + \frac{4x^2}{n^2 - n}. \]

By direct computation $f_1^2(x) = \frac{1}{n} f_0(x) + \frac{n-1}{n} f_2(x)$. Combining this and
the fact that $f_2$ corresponds to the eigenvalue $1 + \frac{2k^2 - 2kn}{n^2 - n}$ gives the claimed
variance.
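The mean and variance formulas can be verified exactly for small parameters by pushing the Hamming-weight distribution through the walk. The following is a sketch under the assumption that the walk starts at the identity; the function names and the choice $n = 8$, $k = 3$, $l = 5$ are illustrative only.

```python
from fractions import Fraction
from math import comb

def weight_step(dist, n, k):
    """One lazy k-flip step on the Hamming-weight distribution, in exact arithmetic."""
    half = Fraction(1, 2)
    new = [half * p for p in dist]
    for w, p in enumerate(dist):
        for a in range(max(0, k - (n - w)), min(k, w) + 1):
            prob = Fraction(comb(w, a) * comb(n - w, k - a), comb(n, k))
            new[w + k - 2 * a] += half * p * prob
    return new

def moments_of_f1(n, k, l):
    """First and second moments of f_1(x) = 1 - 2x/n after l steps from weight 0."""
    dist = [Fraction(1)] + [Fraction(0)] * n
    for _ in range(l):
        dist = weight_step(dist, n, k)
    f1 = [1 - Fraction(2 * x, n) for x in range(n + 1)]
    mean = sum(p * v for p, v in zip(dist, f1))
    second = sum(p * v * v for p, v in zip(dist, f1))
    return mean, second

n, k, l = 8, 3, 5
mean, second = moments_of_f1(n, k, l)
lam1 = 1 - Fraction(k, n)                                   # eigenvalue of f_1
lam2 = 1 + Fraction(2 * k * k - 2 * k * n, n * n - n)       # eigenvalue of f_2
variance_of_f = n * (second - mean ** 2)                    # f = sqrt(n) * f_1
```

Working with $f_1$ instead of $f = \sqrt{n} f_1$ keeps every quantity rational, so the comparison with the closed forms is exact rather than approximate.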

Now, take $l$ of the form $\frac{n}{2k}\log n - c\,\frac{n}{k}$ (where $c > 0$),

1. First case to be considered is k = |_(|J where d is a constant. To simplify 
the notation, say that k — 




and 


Var{f(Z l )} = l + (n-l) 1 — 


= 1 + {n — 1) 1 


2 kn — 2k 2 


rr — n 


— n 1 — 


n 


21 


Mi - j) 

d(n — 1) 


1 A \ 1 


— n I 1 — 


21 
< 1 + (n - 1) 1


2(1 - 


i\\ l 


d 


n ( 1 — — ) = 

a. 


21 


1 + (n- 1) 


d-i\ 21 (d- r 2 ' 

— n 


d ) \ d 

In this case if $l = \frac{1}{2}\log_{d/(d-1)}(n) - c$ then

\[ E\{f(Z_l)\} \ge \left(\frac{d}{d-1}\right)^{c}, \]

so if $0 < c < \frac{1}{2}\log_{d/(d-1)}(n)$ the expectation $E\{f(Z_l)\}$ can get big while

\[ \mathrm{Var}\{f(Z_l)\} \le 2. \]

2. If there is 0 < e < 1 so that k = 0(n e ) then the mean becomes 


£{ /(Z,)}=exp(c+ 0 (^) + o(f 


which means that for 0 < c < ^ log (n/k) this expectation is big. Simi¬ 
larly for the variance 

VarifiZt)} = 

i / -i \ f 2n —2k n — k (k \ f ck 

1 + (n — 1 ) exp c-log n + (J I — log n I + CM — 

\ n — 1 n — 1 \n J \n 

— exp ( 2 c+ O + O 


< 1 + O I - ) + e 2c ( O I ] + O 


/ k log n 


n 


ck 

n 


Therefore the variance is uniformily bounded for 0 < c < | log (n/k). 

In both cases, Chebyshev's inequality gives that for the set $A_\alpha = \{x : |f(x)| \le \alpha\}$,

U(A a ) >1-4 while P*\A a ) < B 

a- (e zc — a ) z 

where B is uniformly bounded when 0 < c < 1 log n. 
Therefore,


I P 


*i 


U\\t.v. >1-o 


B 


a* 


(e 2c — a) 2 


Now, take a — Lp- which finishes the proof. 


5 Fourier Transform Arguments

In this section, a different approach is introduced. It combines the representation
theory of the hypercube and the Fourier transform to provide a bound
for the mixing time. All of the irreducible representations of the hypercube
are one dimensional and they are indexed by $z \in (\mathbb{Z}/2\mathbb{Z})^n$ in the following
way:

\[ \rho_z(w) = (-1)^{z \cdot w} \]

where $z \cdot w$ is the inner product of $z, w \in (\mathbb{Z}/2\mathbb{Z})^n$.

The Fourier transform of a probability $P$ at a representation $\rho$ is defined
as:

\[ \hat{P}(\rho) = \sum_{g \in (\mathbb{Z}/2\mathbb{Z})^n} P(g)\rho(g), \]

which in our case means

\[ \hat{P}(\rho_z) = p + (1 - p)\sum_{a=0}^{j} (-1)^a \frac{\binom{j}{a}\binom{n-j}{k-a}}{\binom{n}{k}} = p + (1 - p)K_j^n(k) \qquad (8) \]

where $j$ is the number of ones that $z$ has and $K_j^n(k)$ is the $j$-th Krawtchouk
polynomial evaluated at $k$.

In Chapter 3 of [3], one can find the Upper Bound Lemma (Lemma 1 in
the book) which shows how to use the Fourier transform at the representations
of a group to find an upper bound for the mixing time of a walk on the group.
More precisely, the Upper Bound Lemma in the case of the hypercube (or in
general for $(\mathbb{Z}/p\mathbb{Z})^n$) says:

Lemma 10. (Upper Bound Lemma) For a random walk on the hypercube,
after $l$ steps:

\[ 4\|P^{*l} - U\|^2_{T.V.} \le \sum_{z \ne 0} \left(\hat{P}(\rho_z)\right)^{2l}. \]
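To see how the Upper Bound Lemma compares with the true distance, one can evaluate both sides for small $n$. The sketch below assumes the lazy walk with holding probability $p = \frac{1}{2}$ started at the identity, groups the representations by the number $j$ of ones (there are $\binom{n}{j}$ of each), and computes the exact total variation distance on Hamming weights. All function names are illustrative.

```python
from math import comb

def lazy_eigenvalue(n, k, j):
    """Fourier transform of the lazy k-flip walk at a representation with j ones."""
    s = sum((-1) ** b * comb(j, b) * comb(n - j, k - b) for b in range(min(j, k) + 1))
    return 0.5 + 0.5 * s / comb(n, k)

def upper_bound_lemma_sum(n, k, l):
    """Right-hand side of the Upper Bound Lemma, summed over j = 1..n."""
    return sum(comb(n, j) * lazy_eigenvalue(n, k, j) ** (2 * l) for j in range(1, n + 1))

def tv_to_uniform(n, k, l):
    """Exact total variation distance to uniform after l steps from the identity."""
    dist = [1.0] + [0.0] * n
    for _ in range(l):
        new = [0.5 * p for p in dist]                       # lazy half
        for w, p in enumerate(dist):
            for a in range(max(0, k - (n - w)), min(k, w) + 1):
                prob = comb(w, a) * comb(n - w, k - a) / comb(n, k)
                new[w + k - 2 * a] += 0.5 * p * prob
        dist = new
    uniform = [comb(n, w) / 2 ** n for w in range(n + 1)]
    return 0.5 * sum(abs(p - u) for p, u in zip(dist, uniform))

# Pairs (4 * TV^2, bound) for n = 10, k = 3 and l = 0..29.
comparison = [(4 * tv_to_uniform(10, 3, l) ** 2, upper_bound_lemma_sum(10, 3, l))
              for l in range(30)]
```

The gap between the two columns of `comparison` is the kind of slack between the $\ell^2$ bound and the total variation distance that the later sections discuss.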
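6 The case k=n/2

In the case where $n$ is even with $n = 2k$ where $k$ is a positive, odd integer,
the following facts hold:

Lemma 11. For $k = \frac{n}{2}$ the Fourier transform at the representation $\rho_z$ is
given by

\[ \hat{P}(\rho_z) = \begin{cases} \frac{1}{2}, & \text{if } j \text{ is odd} \\ \frac{1}{2} + \frac{1}{2}(-1)^i \frac{\binom{n/2}{i}}{\binom{n}{2i}}, & \text{if } j = 2i \end{cases} \]

where $j$ is the number of ones that $z$ has.

Proof. According to Koekoek and Swarttouw in [10] the $j$-th Krawtchouk polynomial
satisfies the following recurrence relation:

\[ -kK_j^n(k) = \frac{1}{2}(n - j)K_{j+1}^n(k) - \frac{n}{2}K_j^n(k) + \frac{j}{2}K_{j-1}^n(k), \]

which for $k = \frac{n}{2}$ gives that

\[ K_j^n\left(\frac{n}{2}\right) = \begin{cases} 0, & \text{if } j \text{ is odd} \\ (-1)^i \frac{\binom{n/2}{i}}{\binom{n}{2i}}, & \text{if } j = 2i \end{cases} \]

given that $K_0^n(k) = 1$ and $K_1^n(k) = 1 - \frac{2k}{n}$.

□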


The next step is to bound the eigenvalues and use the Upper Bound
Lemma to actually get an upper bound for the $\ell^2$ norm:

Lemma 12. For every representation $\rho_z$ where $z \ne 0$,

\[ |\hat{P}(\rho_z)| \le \frac{3}{4}. \qquad (9) \]

Proof. If $j$ is the number of ones $z$ has then if $j$ is odd the lemma holds
because $\hat{P}(\rho_z) = \frac{1}{2}$. If $j = 2i$ the quantity $\frac{1}{2} + \frac{(-1)^i\binom{n/2}{i}}{2\binom{n}{2i}}$ is the main concern.
For $i$ odd, the second term is negative but at least $-1/2$, therefore the
quantity is nonnegative and less than $\frac{1}{2}$. For $i$ even it turns out that $c_i = \frac{\binom{n/2}{i}}{\binom{n}{2i}}$ is
maximized for $i = \frac{n}{2} - 1$ and at most $\frac{1}{2}$ by a simple argument. Thus for $i$
even,

\[ \frac{1}{2} + \frac{(-1)^i\binom{n/2}{i}}{2\binom{n}{2i}} \le \frac{3}{4}. \]

□
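Lemma 11 and Lemma 12 can be checked in exact arithmetic for a few odd values of $k$. This is a sketch with hypothetical function names: it compares the general Krawtchouk expression for the lazy eigenvalue with the closed form of Lemma 11 and records whether each eigenvalue with $z \ne 0$ has absolute value at most $\frac{3}{4}$.

```python
from fractions import Fraction
from math import comb

def lazy_eigenvalue(n, k, j):
    """General formula: 1/2 + 1/2 * sum_b (-1)^b C(j,b) C(n-j,k-b) / C(n,k)."""
    s = sum((-1) ** b * comb(j, b) * comb(n - j, k - b) for b in range(min(j, k) + 1))
    return Fraction(1, 2) + Fraction(s, 2 * comb(n, k))

def lemma11_value(n, j):
    """Closed form for k = n/2: 1/2 if j is odd, 1/2 + (-1)^i C(n/2,i) / (2 C(n,2i)) if j = 2i."""
    if j % 2 == 1:
        return Fraction(1, 2)
    i = j // 2
    return Fraction(1, 2) + Fraction((-1) ** i * comb(n // 2, i), 2 * comb(n, 2 * i))

results = []
for k in (1, 3, 5, 7):                      # k odd, n = 2k
    n = 2 * k
    for j in range(1, n + 1):
        results.append((lazy_eigenvalue(n, k, j), lemma11_value(n, j)))
```

Each pair in `results` holds the eigenvalue computed from the general formula and the value predicted by Lemma 11.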
Theorem 13. For $k = \frac{n}{2}$ and for $l = \frac{n\log 2 - \log\epsilon}{2\log\frac{4}{3}}$ with $0 < \epsilon < 1$,

\[ 4\|P^{*l} - U\|^2_{T.V.} \le \epsilon. \]

Proof. After having computed the Fourier transform at each representation
and bounded them in Lemma 12, the Upper Bound Lemma gives:

\[ 4\|P^{*l} - U\|^2_{T.V.} \le \sum_{j=1}^{n} \binom{n}{j}\left(\hat{P}(\rho_z)\right)^{2l} \le 2^n\left(\frac{3}{4}\right)^{2l} \le \epsilon \]

for $l = \frac{n\log 2 - \log\epsilon}{2\log\frac{4}{3}}$. □

Remark 14. Notice that for the $\ell^2$ norm,

\[ |G|\,\|P^{*l} - U\|_2^2 \ge \left(\frac{1}{2}\right)^{2l}\sum_{j \text{ is odd}} \binom{n}{j}, \]

so indeed the $\ell^2$ norm cannot give a better upper bound for the mixing time.
However, the $\ell^1$ norm may still be small for smaller $l$.

7 Similar Random Walks

This section focuses on similar random walks on $(\mathbb{Z}/m\mathbb{Z})^n$. It uses Fourier
transforms and comparison theory methods to bound their mixing times.


8 A random walk on $(\mathbb{Z}/m\mathbb{Z})^n$


Consider the walk generated by the measure

\[ Q(a_{i_1} e_{i_1} + a_{i_2} e_{i_2} + \cdots + a_{i_k} e_{i_k}) = \frac{1}{m^k\binom{n}{k}}, \]

where $a_{i_j} \in \mathbb{Z}/m\mathbb{Z}$ and $\{i_1, i_2, \ldots, i_k\} \subset \{1, 2, \ldots, n\}$.
Here is a proof of Theorem 3.


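Proof. Let $\rho_a$ denote a representation of $(\mathbb{Z}/m\mathbb{Z})^n$, where $a \in (\mathbb{Z}/m\mathbb{Z})^n$. If
$g = (g_1, \ldots, g_n) \in (\mathbb{Z}/m\mathbb{Z})^n$ then

\[ \rho_a(g) = e^{\frac{2\pi i}{m}\sum_{j=1}^{n} a_j g_j}. \]

Then the Fourier transform of $Q$ at this representation is

\[ \hat{Q}(\rho_a) = \frac{1}{m^k\binom{n}{k}} \sum_{\{i_1, \ldots, i_k\}} \; \sum_{g_{i_1}, \ldots, g_{i_k}} e^{\frac{2\pi i}{m}\sum_{j=1}^{k} a_{i_j} g_{i_j}}, \]

where the outer sum runs over the sets of $k$ coordinates and the inner sum over the
values placed on them. Let $|a|$ denote the number of positions where $a_i$ is not equal to zero. Since

\[ \sum_{g_j \in \mathbb{Z}/m\mathbb{Z}} e^{\frac{2\pi i}{m} a_j g_j} = m\,\delta_{0, a_j}, \]

if $n - |a| \ge k$,

\[ \hat{Q}(\rho_a) = \frac{\binom{n - |a|}{k}}{\binom{n}{k}}, \]

otherwise $\hat{Q}(\rho_a) = 0$. These are the eigenvalues of the random walk that are
not equal to 1.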
not equal to 1. 

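Now notice that all of the eigenvalues are non-negative and in particular

\[ \frac{\binom{n - |a|}{k}}{\binom{n}{k}} = \left(1 - \frac{k}{n}\right)\left(1 - \frac{k}{n - 1}\right)\cdots\left(1 - \frac{k}{n - |a| + 1}\right) \le e^{-k\sum_{j = n - |a| + 1}^{n}\frac{1}{j}} \le e^{-k\log\frac{n + 1}{n - |a| + 1}} = \left(1 - \frac{|a|}{n + 1}\right)^{k}. \]

Then, the Upper Bound Lemma (Lemma 10) gives that

\[ 4\|Q^{*l} - U\|^2_{T.V.} \le \sum_{j=1}^{n-k} \binom{n}{j}(m - 1)^j\left(\frac{\binom{n - j}{k}}{\binom{n}{k}}\right)^{2l} \le \sum_{j=1}^{n-k} \binom{n}{j}(m - 1)^j\left(1 - \frac{j}{n + 1}\right)^{2kl} \le e^{-c} \]

if $l = \frac{n + 1}{2k}\log(mn) + c\,\frac{n + 1}{2k}$.

□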
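The eigenvalue formula used in this proof can be confirmed by brute force for small parameters. The sketch below assumes that $Q$ is the law obtained by choosing a uniform set of $k$ coordinates and placing independent uniform elements of $\mathbb{Z}/m\mathbb{Z}$ on them, which is how the definition of $Q$ is read here; the function name `q_hat` is hypothetical.

```python
import cmath
from itertools import combinations, product
from math import comb

def q_hat(a, m, k):
    """Fourier transform of Q at rho_a by direct summation over subsets and values."""
    n = len(a)
    total = 0
    for subset in combinations(range(n), k):
        for values in product(range(m), repeat=k):
            phase = sum(a[i] * g for i, g in zip(subset, values))
            total += cmath.exp(2j * cmath.pi * phase / m)
    return total / (comb(n, k) * m ** k)

m, n, k = 3, 4, 2
max_error = 0.0
for a in product(range(m), repeat=n):
    weight = sum(1 for coordinate in a if coordinate != 0)      # |a|
    expected = comb(n - weight, k) / comb(n, k)                 # equals 0 when n - |a| < k
    max_error = max(max_error, abs(q_hat(a, m, k) - expected))
```

Here `math.comb` returns 0 when $n - |a| < k$, which matches the case $\hat{Q}(\rho_a) = 0$ in the proof.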


Remark 15. Notice that $T$, the first time that all coordinates have been
touched, is a strong stationary time, which implies that the total variation
distance needs order $\frac{n}{k}\log n$ steps to get small. To see this one can imitate
the calculation for the coupon collector problem as presented in Lemma 2 of
Further, notice that 


\G\\\Q* 1 


U H 2 > n(m 




2 kl 


which means that the l 2 norm needs at least ifr \og{mn ) + steps to get 

small. Therefore there is a gap between the separation distance and the l 2 
norm mixing times. 


9 Comparison Theory Application 


Comparison theory can help provide an upper bound for the following exam¬ 
ple: 


Example 16. With notation as in Theorem 3 consider the case $k = 1$.
Then,

\[ Q(b e_i) = \frac{1}{mn} \]

for $b \in \mathbb{Z}/m\mathbb{Z}$ and $1 \le i \le n$. The walk is "pick a coordinate at random and
randomize it". Theorem 3 states that if $l = \frac{1}{2}\left((n + 1)\log(mn) + c(n + 1)\right)$
then

\[ 4\|Q^{*l} - U\|^2_{T.V.} \le e^{-c}. \]

But then comparison theory gives the following theorem for the mixing time
of the random walk generated by

p ^ = b p w = l 

Theorem 17. Let l = ^m 2 ((n + 1) log(mn) + c(n + 1) then 


W? - U\\ 2 t.v. <(l + ^)e _c 

Proof. Let $S = \{\pm e_j, \mathrm{id}\}$ and $S' = \{b e_j : b \in \mathbb{Z}/m\mathbb{Z}\}$. According to P.
Diaconis and L. Saloff-Coste [6], if we represent each $z \in S'$ as a product
of elements of $S$ that has odd length and

\[ A = \max_{s \in S} \frac{1}{P_2(s)} \sum_{z \in S'} \|z\|\, N(z, s)\, Q(z), \]

where $\|z\|$ is the length of this representation and $N(z, s)$ is the number of
times $s$ is inside the representation of $z$, then

\[ 4\|P_2^{*l} - U\|^2_{T.V.} \le m^n e^{-l/A} + m^n\|Q^{*l/2A} - U\|_2^2. \]

An easy argument shows that A < max{m 2 , A -g 2m} = m 2 therefore if 
l = + 1) log(mra) + c(n + 1)), 

M\p? - u\\ 2 T . v . < ^ + e - c 

□ 